Multigrain Parallelism for Eigenvalue Computations on Networks of Clusters

نویسندگان

  • James R. McCombs
  • Andreas Stathopoulos
چکیده

Clusters of workstations have become a cost-effective means of performing scientific computations. However, large network latencies, resource sharing, and heterogeneity found in networks of clusters and Grids can impede the performance of applications not specifically tailored for use in such environments. A typical example is the traditional fine grain implementations of Krylov-like iterative methods, a central component in many scientific applications. To exploit the potential of these environments, advances in networking technology must be complemented by advances in parallel algorithmic design. In this paper, we present an algorithmic technique that increases the granularity of parallel, block iterative methods by inducing additional work during the preconditioning (inexact solution) phase of the iteration. During this phase, each vector in the block is preconditioned by a different subgroup of processors, yielding a much coarser granularity. The rest of the method comprises a small portion of the total time and is still implemented in fine grain. We call this combination of fine and coarse grain parallelism multigrain. We apply this idea to the block Jacobi-Davidson eigensolver, and present experimental data that shows the significant reduction of latency effects on networks of clusters of roughly equal capacity and size. We conclude with a discussion on how multigrain can be applied dynamically based on runtime network performance monitoring.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Parallel, multigrain iterative solvers for hiding network latencies on MPPs and networks of clusters

Parallel iterative solvers are often the only means of solving large linear systems and eigenproblems. However, these solvers are usually implemented in a fine grain manner and, when scaled to large numbers of processors on MPP’s, can incur significant performance penalties due to synchronization overheads. This problem is exacerbated in clusters of workstations (COWs) and SMPs that are interco...

متن کامل

Coarse-Grain Task Parallel Processing Using the OpenMP Backend of the OSCAR Multigrain Parallelizing Compiler

This paper describes automatic coarse grain parallel processing on a shared memory multiprocessor system using a newly developed OpenMP backend of OSCAR multigrain parallelizing compiler for from single chip multiprocessor to a high performance multiprocessor and a heterogeneous supercomputer cluster. OSCAR multigrain parallelizing compiler exploits coarse grain task parallelism and near ne gra...

متن کامل

A multigrain Delaunay mesh generation method for multicore SMT-based architectures

Given the proliferation of layered, multicoreand SMT-based architectures, it is imperative to deploy and evaluate important, multi-level, scientific computing codes, such as meshing algorithms, on these systems. We focus on Parallel Constrained Delaunay Mesh (PCDM) generation. We exploit coarse-grain parallelism at the subdomain level, medium-grain at the cavity level and fine-grain at the elem...

متن کامل

Factory: An Object-Oriented Parallel Programming Substrate for Deep Multiprocessors

Recent advances in processor technology such as Simultaneous Multithreading (SMT) and Chip Multiprocessing (CMP) enable parallel processing on a single die. These processors are used as building blocks of shared-memory multiprocessor systems, or clusters of multiprocessors. New programming languages and tools are necessary due to the complexities introduced by systems with multigrain, multileve...

متن کامل

Runtime Support for Multigrain and Multiparadigm Parallelism

This paper presents a general methodology for implementing on clusters the runtime support for a two-level dependence-driven thread model, initially targeted to shared-memory multiprocessors. The general ideal is to exploit existing programming solutions for these architectures, like Software DSM (SWDSM) and Message Passing Interface. The management of the internal runtime system structures and...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002